Claims:

1. A system for parsing a piece of foreign language text into one or more phrases which characterize a foreign language document, the system comprising:

a buffer for reading one or more words from the piece of text into the buffer until a break character is identified;

a parser for identifying a phrase contained in the buffer, the phrase being a sequence of two or more words in between break characters;

the parser further comprising means for determining the type of break character that follows the identified phrase and means for saving a key phrase from the buffer based on the determined type of break character;

a database for storing the key foreign language phrases.

2. The system of Claim 1, wherein the buffer further comprises means for flushing the buffer when the key phrase is stored in the database or the phrase in the buffer is deleted.

3. The system of Claim 1 further comprising a retriever for retrieving all occurrences of the extracted phrases from the piece of text after the piece of text has been parsed.

4. A method for parsing a piece of text into one or more phrases which characterize the document, the method comprising:

reading one or more words from the piece of text into a buffer until a break character is identified;

identifying a phrase contained in the buffer, the phrase being a sequence of two or more words in between break characters;

determining the type of break character that follows the identified phrase; and

saving a key phrase from the buffer into a database based on the determined type of break character.

5. The method of Claim 4 further comprising flushing the buffer when the key phrase is stored in the database or the phrase in the buffer is deleted.

6. The method of Claim 4 further comprising retrieving all occurrences of the extracted phrases from the piece of text after the piece of text has been parsed.
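
For illustration only, the reading, identifying, determining, saving and flushing steps recited above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the claims do not enumerate the break characters or say which break types cause a phrase to be saved, so the `BREAK_TYPES` table, the `save_types` parameter, the function name `extract_key_phrases` and the use of a plain list as the database are all hypothetical.

```python
import re

# Hypothetical break-character classes; the claims do not enumerate them.
BREAK_TYPES = {
    ".": "sentence", "!": "sentence", "?": "sentence",
    ",": "clause", ";": "clause", ":": "clause",
}

# A token is either a run of non-break, non-space characters (a word)
# or a single break character.
TOKEN_RE = re.compile(r"[^\s.,;:!?]+|[.,;:!?]")


def extract_key_phrases(text, save_types=("sentence", "clause")):
    """Return the phrases saved during one pass over the text.

    save_types is an assumed policy: the break types after which a
    phrase is kept. Phrases followed by any other type are deleted.
    """
    database = []  # stands in for the database of key phrases
    buffer = []    # words read since the last break character
    for token in TOKEN_RE.findall(text):
        if token not in BREAK_TYPES:
            buffer.append(token)  # keep reading words into the buffer
            continue
        break_type = BREAK_TYPES[token]  # type of the break character
        # A phrase is a sequence of two or more words between breaks.
        if len(buffer) >= 2 and break_type in save_types:
            database.append(" ".join(buffer))
        buffer.clear()  # flush whether the phrase was stored or deleted
    # Words left in the buffer with no following break character are
    # never saved, because no break type can be determined for them.
    return database
```

In this sketch, calling `extract_key_phrases("neural networks, fast; the quick brown fox.")` keeps the two multi-word phrases and drops the single word "fast"; passing `save_types=("sentence",)` would additionally drop the phrase that is followed by a comma.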

7. A system for parsing a piece of text into one or more phrases which characterize a document, the system comprising:

a buffer for reading one or more words from the piece of text into the buffer until a break character is identified;

a parser for identifying a phrase contained in the buffer, the phrase being a sequence of two or more words in between break characters;

the parser further comprising means for determining the type of break character that follows the identified phrase and means for saving a key phrase from the buffer based on the determined type of break character;

a database for storing the key foreign language phrases; and

a retriever for retrieving all occurrences of the extracted phrases from the piece of text after the piece of text has been parsed.

8. The system of Claim 7, wherein the buffer further comprises means for flushing the buffer when the key phrase is stored in the database or the phrase in the buffer is deleted.

9. A method for parsing a piece of text into one or more phrases which characterize the document, the method comprising:

reading one or more words from the piece of text into a buffer until a break character is identified;

identifying a phrase contained in the buffer, the phrase being a sequence of two or more words in between break characters;

determining the type of break character that follows the identified phrase;

saving a key phrase from the buffer into a database based on the determined type of break character; and

retrieving all occurrences of the extracted phrases from the piece of text after the piece of text has been parsed.

10. The method of Claim 9 further comprising flushing the buffer when the key phrase is stored in the database or the phrase in the buffer is deleted.

11. A system for parsing a piece of text into one or more phrases which characterize the document, the system comprising:

a first pass comprising means for identifying a phrase contained in a buffer wherein the phrase is a sequence of two or more words in between break characters, means for determining the type of break character that follows the identified phrase and means for saving a key phrase from the buffer based on the determined type of break character; and

a second pass comprising means for retrieving all occurrences of the extracted phrases from the piece of text.
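
As a hedged illustration of the second pass recited above, the sketch below takes phrases that a first pass is assumed to have already extracted and locates every occurrence of each one in the text. The function name `retrieve_occurrences`, the use of exact substring matching and the choice to report character offsets are assumptions made for this example; the claims do not specify how occurrences are matched or represented.

```python
def retrieve_occurrences(text, phrases):
    """Map each extracted phrase to every start offset where it occurs.

    Matching is by exact substring comparison (an assumption), and
    overlapping occurrences are reported.
    """
    occurrences = {}
    for phrase in phrases:
        positions = []
        start = text.find(phrase)
        while start != -1:
            positions.append(start)
            # Resume one character later so overlapping matches are found.
            start = text.find(phrase, start + 1)
        occurrences[phrase] = positions
    return occurrences
```

For the text `"key phrase here. another key phrase."` and the single extracted phrase `"key phrase"`, this sketch reports the offsets 0 and 25. A phrase that was extracted but never appears verbatim (for example because of differing whitespace) maps to an empty list.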



12. A method for parsing a piece of text into one or more phrases which characterize the document, the method comprising:

performing a first pass through the piece of text, the first pass comprising identifying a phrase contained in a buffer wherein the phrase is a sequence of two or more words in between break characters, determining the type of break character that follows the identified phrase and saving a key phrase from the buffer based on the determined type of break character; and

performing a second pass through the piece of text comprising retrieving all occurrences of the extracted phrases from the piece of text.